Motivation, introduction and explanation

The goal of this exercise is to consolidate and practically apply the data processing skills discussed in class. The exercise attempts to cover several components of the course.

Your task is to (at least) reproduce the results of the econometric analysis shown below in R Markdown, based on the pre-defined data, the sets of requirements, and only R programming skills (no manual copy-pasting).

One of the objectives we pursue by using R, RStudio, R Markdown and other complementary tools for this project is to train the best practices and key concepts of reproducible research with R and RStudio1 and literate programming.

Be accurate! As econometric analysis is not only about getting the numbers right but also about communicating results clearly, this project scrutinizes your data manipulation skills (possibly to the limit). This does not mean that you need to report all 16 digits after the decimal point2 in tables. It does mean that you need to choose carefully what you display and what you don’t, how you label plot axes and where you leave them unnamed, what explanations you provide in table footnotes and how you name the columns.

Do not be deterred… The key point of this exercise (also reflected in the bulk of the total grade for this work) is to reproduce the actual results, while the communication details account for about one fifth of the final grade. However, much of the learning in R happens precisely when one tries to clarify the details and make the story straight and clear for the reader.

You are welcome to make improvements when preparing your homework! Please do demonstrate what you have learned in data analysis and econometrics, what packages you have discovered, mastered and used, and what improvements you have included and developed. You are not limited to the templates presented below. Any innovation may improve your grade3.

Basic rules:

Problem Setting

Nearly reproduce “Table 1 - Descriptive Statistics” from (Acemoglu et al. 2001) using the data set Acemoglu2001.csv. Unfortunately, as mentioned by the authors, it is not possible to reproduce exactly the same numbers as in the paper. However, using the data set provided with this exercise, one can reproduce exactly the same tables and plots as shown below.

Before reproducing the table, make sure that you follow all the data cleaning steps:

## Write code for data loading here
Acemoglu2001 <- read_csv("data/Acemoglu2001.csv")
## Rows: 164 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): iso
## dbl (9): base_sample, africa, asia, other, pgp95, hjypl, avexpr, extmort4, l...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

The data set contains the following variables:

Problem 1. Descriptive statistics (15 points)

Produce a table with descriptive statistics (mean and standard deviation) for all continuous variables present in the sample and for their logarithmic transformations computed at the data cleaning stage (continuous variables are transformed with the natural logarithm). Compute these summary statistics for the entire sample of 164 countries as well as for the sub-sample of base countries.

## Develop R code here...
dta_clean <- slice(Acemoglu2001, -164)  # drop the last row (row 164)
# Replace missing values (NAs) with the column mean
cleaned_dta <- dta_clean %>%
  mutate(pgp95    = replace(pgp95,    is.na(pgp95),    mean(pgp95,    na.rm = TRUE)),
         hjypl    = replace(hjypl,    is.na(hjypl),    mean(hjypl,    na.rm = TRUE)),
         avexpr   = replace(avexpr,   is.na(avexpr),   mean(avexpr,   na.rm = TRUE)),
         extmort4 = replace(extmort4, is.na(extmort4), mean(extmort4, na.rm = TRUE)))

cleaned_log <- cleaned_dta %>%
  mutate(pgp95    = log(pgp95),
         hjypl    = log(hjypl),
         extmort4 = log(extmort4))
glimpse(cleaned_log)
## Rows: 163
## Columns: 10
## $ iso         <chr> "AFG", "AGO", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "B…
## $ base_sample <dbl> 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1…
## $ africa      <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0…
## $ asia        <dbl> 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0…
## $ other       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ pgp95       <dbl> 8.862889, 7.770645, 9.804219, 9.133459, 7.682482, 9.897972…
## $ hjypl       <dbl> -1.22811001, -3.41124773, -1.22811001, -0.87227380, -1.228…
## $ avexpr      <dbl> 7.066491, 5.363636, 7.181818, 6.386364, 7.066491, 9.318182…
## $ extmort4    <dbl> 4.540098, 5.634790, 5.397830, 4.232656, 5.397830, 2.145931…
## $ lat_abst    <dbl> 0.36666667, 0.13666667, 0.26666668, 0.37777779, 0.44444445…
# Means of the continuous variables
table_means <- colMeans(cleaned_dta[, c("pgp95", "hjypl", "avexpr", "extmort4")])
table_means
##        pgp95        hjypl       avexpr     extmort4 
## 7064.8658759    0.2928455    7.0664914  220.9264365
#Descriptive statistics for full sample
# Descriptive statistics for the full sample: mean and sd only.
# Note: the `summ` functions must be written in terms of `x`,
# e.g. "mean(x)" rather than "mean(pgp95)" — the latter yields
# all-NA columns because sumtable cannot evaluate it per variable.
sumtable(cleaned_dta, summ = c("mean(x)", "sd(x)"))
sumtable(cleaned_dta)
Summary Statistics

| Variable    | N   | Mean     | Std. Dev. | Min   | Pctl. 25 | Pctl. 75 | Max       |
|-------------|-----|----------|-----------|-------|----------|----------|-----------|
| base_sample | 163 | 0.393    | 0.49      | 0     | 0        | 1        | 1         |
| africa      | 163 | 0.307    | 0.463     | 0     | 0        | 1        | 1         |
| asia        | 163 | 0.258    | 0.439     | 0     | 0        | 1        | 1         |
| other       | 163 | 0.025    | 0.155     | 0     | 0        | 0        | 1         |
| pgp95       | 163 | 7064.866 | 6984.291  | 450   | 1665     | 8624.998 | 29399.992 |
| hjypl       | 163 | 0.293    | 0.234     | 0.029 | 0.108    | 0.323    | 1         |
| avexpr      | 163 | 7.066    | 1.553     | 1.636 | 6.375    | 7.727    | 10        |
| extmort4    | 163 | 220.926  | 299.819   | 2.55  | 81.6     | 220.926  | 2940      |
| lat_abst    | 162 | 0.296    | 0.19      | 0     | 0.144    | 0.447    | 0.722     |
glimpse(sumtable)
## function (data, vars = NA, out = NA, file = NA, summ = NA, summ.names = NA, 
##     add.median = FALSE, group = NA, group.long = FALSE, group.test = FALSE, 
##     group.weights = NA, col.breaks = NA, digits = NA, fixed.digits = FALSE, 
##     factor.percent = TRUE, factor.counts = TRUE, factor.numeric = FALSE, 
##     logical.numeric = FALSE, logical.labels = c("No", "Yes"), labels = NA, 
##     title = "Summary Statistics", note = NA, anchor = NA, col.width = NA, 
##     col.align = NA, align = NA, note.align = "l", fit.page = "\\textwidth", 
##     simple.kable = FALSE, opts = list())
summarise(cleaned_dta, obs = n(),
          sd_pgp95    = sd(pgp95,    na.rm = TRUE),
          sd_hjypl    = sd(hjypl,    na.rm = TRUE),
          sd_avexpr   = sd(avexpr,   na.rm = TRUE),
          sd_extmort4 = sd(extmort4, na.rm = TRUE),
          sd_lat_abst = sd(lat_abst, na.rm = TRUE))
glimpse(cleaned_dta)
## Rows: 163
## Columns: 10
## $ iso         <chr> "AFG", "AGO", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "B…
## $ base_sample <dbl> 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1…
## $ africa      <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0…
## $ asia        <dbl> 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0…
## $ other       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ pgp95       <dbl> 7064.8659, 2369.9998, 18109.9945, 9259.9978, 2169.9996, 19…
## $ hjypl       <dbl> 0.2928455, 0.0330000, 0.2928455, 0.4180000, 0.2928455, 0.8…
## $ avexpr      <dbl> 7.066491, 5.363636, 7.181818, 6.386364, 7.066491, 9.318182…
## $ extmort4    <dbl> 93.7000, 280.0000, 220.9264, 68.9000, 220.9264, 8.5500, 22…
## $ lat_abst    <dbl> 0.36666667, 0.13666667, 0.26666668, 0.37777779, 0.44444445…
# Descriptive statistics for the base sample
base_sample_stat <- filter(cleaned_dta, base_sample == 1)
base_sample1 <- base_sample_stat[, 6:10]  # keep the continuous variables
sumtable(base_sample1)
Summary Statistics

| Variable | N  | Mean     | Std. Dev. | Min   | Pctl. 25 | Pctl. 75 | Max       |
|----------|----|----------|-----------|-------|----------|----------|-----------|
| pgp95    | 64 | 5445.458 | 6327.345  | 450   | 1480     | 6967.501 | 27329.998 |
| hjypl    | 64 | 0.23     | 0.223     | 0.029 | 0.066    | 0.293    | 1         |
| avexpr   | 64 | 6.516    | 1.469     | 3.5   | 5.614    | 7.352    | 10        |
| extmort4 | 64 | 245.911  | 472.624   | 8.55  | 68.9     | 240      | 2940      |
| lat_abst | 64 | 0.181    | 0.133     | 0     | 0.089    | 0.258    | 0.667     |
# Descriptive statistics after log transformation
# (note: base_sample == 0 selects countries outside the base sample;
# use base_sample == 1 for the base countries)
base_sample_statis <- filter(cleaned_log, base_sample == 0)
base_sample2 <- base_sample_statis[, !names(cleaned_log) %in%
                                     c("base_sample", "africa", "asia", "other")]
sumtable(base_sample2)
Summary Statistics

| Variable | N  | Mean   | Std. Dev. | Min   | Pctl. 25 | Pctl. 75 | Max    |
|----------|----|--------|-----------|-------|----------|----------|--------|
| pgp95    | 99 | 8.543  | 1.042     | 6.461 | 7.81     | 9.349    | 10.289 |
| hjypl    | 99 | -1.418 | 0.919     | -3.54 | -1.554   | -0.892   | -0.014 |
| avexpr   | 99 | 7.423  | 1.508     | 1.636 | 7.066    | 8.205    | 10     |
| extmort4 | 99 | 5.172  | 0.796     | 0.936 | 5.398    | 5.398    | 6.18   |
| lat_abst | 98 | 0.37   | 0.186     | 0.011 | 0.207    | 0.519    | 0.722  |
glimpse(sumtable(base_sample2))
##  'kableExtra' chr "<table class=\"table\" style=\"margin-left: auto; margin-right: auto;\">\n<caption>Summary Statistics</caption>"| __truncated__
##  - attr(*, "format")= chr "html"
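
If sumtable proves fiddly, the mean/sd pairs for Table 1 can also be computed directly with dplyr. A minimal sketch, assuming the `cleaned_dta` and `cleaned_log` objects created above (the helper `summ_stats` is introduced here for illustration only):

```r
library(dplyr)

# Mean and sd for every continuous variable in a data frame
summ_stats <- function(d) {
  d %>%
    summarise(across(c(pgp95, hjypl, avexpr, extmort4, lat_abst),
                     list(mean = ~ mean(.x, na.rm = TRUE),
                          sd   = ~ sd(.x, na.rm = TRUE))))
}

summ_stats(cleaned_dta)                            # whole sample, levels
summ_stats(filter(cleaned_dta, base_sample == 1))  # base sample only
summ_stats(cleaned_log)                            # whole sample, logs
```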

Note: the numbers in the columns correspond to means; standard deviations are reported in parentheses.

If there are difficulties with reproducing the exact table, use other conventional functions for the summary statistics in order to print the same statistics in separate tables or as text output. For example:

Problem 2. Means by quartiles (15 points)

Use the same data to calculate the means of the above-mentioned continuous variables by quartiles of mortality, for the base sample only. The quartiles of mortality are:

  1. for mortality less than 65.4;

  2. greater than or equal to 65.4 and less than 78.1;

  3. greater than or equal to 78.1 and less than 280;

  4. greater than or equal to 280.

## Develop R code here...
# Quartile 1: mortality less than 65.4
Mortality <- base_sample1 %>%
  filter(extmort4 < 65.4)   # strictly less than, per the definition above
summarise(Mortality, across(everything(), ~ mean(.x, na.rm = TRUE)), obs = n())
# Quartile 2: mortality >= 65.4 and < 78.1
Mortality1 <- filter(base_sample1, extmort4 >= 65.4 & extmort4 < 78.1)
summarise(Mortality1, across(everything(), ~ mean(.x, na.rm = TRUE)), obs = n())
# Quartile 3: mortality >= 78.1 and < 280
Mortality2 <- filter(base_sample1, extmort4 >= 78.1 & extmort4 < 280)
summarise(Mortality2, across(everything(), ~ mean(.x, na.rm = TRUE)), obs = n())
# Quartile 4: mortality >= 280
# (for means of the log-transformed variables, summarise the log data
# directly rather than taking log of the means — the two differ)
Mortality3 <- filter(base_sample1, extmort4 >= 280)
summarise(Mortality3, across(everything(), ~ mean(.x, na.rm = TRUE)), obs = n())
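
The four filter-and-summarise blocks can be collapsed into one pipeline with cut() and group_by(). A sketch, assuming the `base_sample1` data frame defined earlier:

```r
library(dplyr)

quartile_means <- base_sample1 %>%
  mutate(mort_q = cut(extmort4,
                      breaks = c(-Inf, 65.4, 78.1, 280, Inf),
                      right = FALSE,  # intervals are [a, b): "less than 65.4", etc.
                      labels = c("Q1", "Q2", "Q3", "Q4"))) %>%
  group_by(mort_q) %>%
  summarise(across(c(pgp95, hjypl, avexpr, extmort4, lat_abst),
                   ~ mean(.x, na.rm = TRUE)),
            obs = n())
quartile_means
```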

Problem 3. Visual inspection of the categorical and continuous data (15 points)

Develop two box plots of the GDP per capita and European settler mortality by regions (Africa, Asia, Others and Latin America) for the base sample only.

## Box plot of GDP per capita by region
Dataframe <- select(cleaned_dta, africa:pgp95)
africa_dt <- filter(Dataframe, africa == 1)
asia_dt <- filter(Dataframe, asia == 1)
other_dt <- filter(Dataframe, other == 1)
Latin_America_dt <- filter(Dataframe, africa == 0 & asia == 0 & other == 0)
dt_boxplot <- rbind(africa_dt, asia_dt, other_dt, Latin_America_dt)
GDP <- mutate(dt_boxplot, region = africa + asia + other)
# Recode region by row blocks (rows are stacked in the order
# Africa, Asia, Other, Latin America)
attach(GDP)
GDP[51:92, 5] <- 2
GDP[93:96, 5] <- 4
GDP[97:163, 5] <- 3
New_GDP <- GDP
class(New_GDP$region)
## [1] "numeric"
head(New_GDP$region, 53)
##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [39] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2
New_GDP$region[New_GDP$region == 1] <- "Africa"
New_GDP$region[New_GDP$region == 2] <- "Asia"
New_GDP$region[New_GDP$region == 3] <- "Latin.America"
New_GDP$region[New_GDP$region == 4] <- "Other"
GDP_per_capita_1995 <- pgp95
boxplot(GDP_per_capita_1995 ~ region, data = New_GDP, col = rainbow(4))

# Box plot of the mortality rate by region
Dataframe1 <- cleaned_dta[, !names(cleaned_dta) %in%
                            c("base_sample", "iso", "pgp95", "hjypl", "avexpr", "lat_abst")]
africa1_dta <- filter(Dataframe1, africa == 1)
asia1_dta <- filter(Dataframe1, asia == 1)
other1_dta <- filter(Dataframe1, other == 1)
Latin_America1_dta <- filter(Dataframe1, africa == 0 & asia == 0 & other == 0)
Mortality_rate <- rbind(africa1_dta, asia1_dta, other1_dta, Latin_America1_dta)
New_Mortality_rate <- mutate(Mortality_rate, region = africa + asia + other)
attach(New_Mortality_rate)
## The following objects are masked from GDP:
## 
##     africa, asia, other, region
New_Mortality_rate[51:92, 5] <- 2
New_Mortality_rate[93:96, 5] <- 4
New_Mortality_rate[97:163, 5] <- 3
Plot_mortality <- New_Mortality_rate
as.factor(Plot_mortality$region)
##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [112] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [149] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## Levels: 1 2 3 4
Plot_mortality$region[Plot_mortality$region == 1] <- "africa"
Plot_mortality$region[Plot_mortality$region == 2] <- "asia"
Plot_mortality$region[Plot_mortality$region == 3] <- "L.America"
Plot_mortality$region[Plot_mortality$region == 4] <- "other"
European_settler_Mortality <- extmort4
boxplot(European_settler_Mortality ~ region, data = Plot_mortality, col = rainbow(4))
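
The row-index bookkeeping above is fragile (it breaks if the row order changes); the region label can instead be derived directly from the dummy variables. A sketch under the same data, using dplyr's case_when():

```r
library(dplyr)

# Derive the region label from the africa/asia/other dummies;
# rows with all three dummies equal to 0 are Latin America
plot_dta <- cleaned_dta %>%
  mutate(region = case_when(
    africa == 1 ~ "Africa",
    asia   == 1 ~ "Asia",
    other  == 1 ~ "Other",
    TRUE        ~ "Latin America"
  ))

boxplot(pgp95 ~ region, data = plot_dta, col = rainbow(4),
        ylab = "GDP per capita, 1995")
boxplot(extmort4 ~ region, data = plot_dta, col = rainbow(4),
        ylab = "European settler mortality")
```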

Problem 4. T-test about the difference between groups (5 points)

Filter two regions, Africa and Latin America, and compare the mean mortality rates between these two groups. Perform a t-test to compare the means.

## Develop R code here…
Data_frame <- select(cleaned_dta, africa:extmort4)
AFRICA <- filter(Data_frame, africa == 1)
ASIA <- filter(Data_frame, asia == 1)
OTHER <- filter(Data_frame, other == 1)
L_America <- filter(Data_frame, africa == 0 & asia == 0 & other == 0)
Continent <- rbind(AFRICA, ASIA, OTHER, L_America)
Continent1 <- mutate(Continent, region = africa + asia + other)
attach(Continent1)
## The following objects are masked from New_Mortality_rate:
## 
##     africa, asia, extmort4, other, region
## The following objects are masked from GDP:
## 
##     africa, asia, other, pgp95, region
Continent1[51:92, 8] <- 2
Continent1[93:96, 8] <- 4
Continent1[97:163, 8] <- 3
New_Continent <- Continent1
as.factor(New_Continent$region)
##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [112] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [149] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## Levels: 1 2 3 4
New_Continent$region[New_Continent$region == 1] <- "africa"
New_Continent$region[New_Continent$region == 2] <- "asia"
New_Continent$region[New_Continent$region == 3] <- "L.America"
New_Continent$region[New_Continent$region == 4] <- "other"
Africa_mortality <- filter(New_Continent, region == "africa")
L_America_mortality <- filter(New_Continent, region == "L.America")
t.test(Africa_mortality$extmort4, L_America_mortality$extmort4)
## 
##  Welch Two Sample t-test
## 
## data:  Africa_mortality$extmort4 and L_America_mortality$extmort4
## t = 2.927, df = 50.737, p-value = 0.005111
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   65.78114 353.16888
## sample estimates:
## mean of x mean of y 
##  366.2865  156.8115

Explain in your own words: what are the causal implications of the observed differences? In other words, if we observe a difference between the regions, does it mean that the regions themselves cause the difference in income and mortality rates?

Write your answer here: … The results show a mortality rate in Africa (366.2865) more than twice as high as in Latin America (156.8115). This may be due to lower levels of GDP in Africa compared with Latin America, which correlate positively with mortality rates: low GDP may limit the development of the health system and the food sector. However, this does not mean that the regions themselves cause the differences in income and mortality rates. Some mortality is driven by shocks such as extreme weather, epidemics, and even pandemics such as COVID-19, which cut across continents.

Problem 5. Relationship between continuous variables (10 points)

Develop three scatter plots similar to Figures 1, 2 and 3 in (Acemoglu et al. 2001) and compute the corresponding correlation coefficients. Please note that the correlations do not necessarily have to be plotted; they can be printed in a table instead.

## Develop R code here...
pairs(~ pgp95 + avexpr + extmort4, data = cleaned_dta,
      main = "Simple Scatterplot Matrix")
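
The correlation coefficients requested alongside the scatter plots can be printed as a table with base R's cor(); a sketch, assuming `cleaned_dta` as created above:

```r
# Pairwise correlations between the three plotted variables;
# pairwise.complete.obs drops NAs pair by pair
cor(cleaned_dta[, c("pgp95", "avexpr", "extmort4")],
    use = "pairwise.complete.obs")
```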


Problem 6. Reproduce OLS regressions 1-6 from the Table 2 in (Acemoglu et al. 2001) (20 points)

## Develop R code here...
regress <- lm(pgp95 ~ avexpr + other + lat_abst + asia + africa, data = cleaned_dta)
summary(regress)
## 
## Call:
## lm(formula = pgp95 ~ avexpr + other + lat_abst + asia + africa, 
##     data = cleaned_dta)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -8723  -3600  -1036   2480  13600 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -10576.8     2247.5  -4.706 5.53e-06 ***
## avexpr        2529.2      309.1   8.182 9.21e-14 ***
## other         1853.2     2591.9   0.715   0.4757    
## lat_abst      3090.0     2675.3   1.155   0.2499    
## asia         -1133.1     1020.9  -1.110   0.2688    
## africa       -2817.8     1120.2  -2.516   0.0129 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5001 on 156 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.5062, Adjusted R-squared:  0.4904 
## F-statistic: 31.99 on 5 and 156 DF,  p-value: < 2.2e-16
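
Note that Table 2 of (Acemoglu et al. 2001) uses log GDP per capita as the dependent variable, whereas the regression above uses the level. A hedged variant closer to the paper's specification (same data, log outcome):

```r
# OLS with log GDP per capita as outcome, as in the paper's Table 2
regress_log <- lm(log(pgp95) ~ avexpr + other + lat_abst + asia + africa,
                  data = cleaned_dta)
summary(regress_log)
```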

Problem 7. Reproduce IV regressions 1, 2, 7, and 8 from the Table 4 in (Acemoglu et al. 2001) (20 points)

## Develop R code here...
# install.packages("ivreg", dependencies = TRUE)
cleaned_dta <- cleaned_dta %>%
  mutate(logpgp95 = log(pgp95),
         logextmort4 = log(extmort4),
         loghjypl = log(hjypl))
cleaned_dta <- cleaned_dta[, !names(cleaned_dta) %in% c("pgp95", "loghjypl")]
regress <- lm(logpgp95 ~ avexpr + lat_abst + asia + africa +
                other + logextmort4, data = cleaned_dta)
summary(regress)
## 
## Call:
## lm(formula = logpgp95 ~ avexpr + lat_abst + asia + africa + other + 
##     logextmort4, data = cleaned_dta)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.65317 -0.50674  0.02696  0.42997  2.19663 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.63071    0.48878  15.612  < 2e-16 ***
## avexpr       0.25203    0.04770   5.283 4.24e-07 ***
## lat_abst     0.69596    0.41858   1.663  0.09840 .  
## asia        -0.32328    0.15588  -2.074  0.03975 *  
## africa      -0.75355    0.17849  -4.222 4.12e-05 ***
## other       -0.16774    0.42349  -0.396  0.69258    
## logextmort4 -0.19030    0.06807  -2.796  0.00583 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7611 on 155 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.511,  Adjusted R-squared:  0.492 
## F-statistic: 26.99 on 6 and 155 DF,  p-value: < 2.2e-16
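
The regression above enters the instrument (log settler mortality) directly as a regressor; Table 4 of the paper instead instruments avexpr with settler mortality via two-stage least squares. A sketch using the ivreg package (CRAN), assuming `cleaned_dta` with `logpgp95` and `logextmort4` as created above — the exact covariate sets for columns 1, 2, 7 and 8 should be taken from the paper:

```r
library(ivreg)

# 2SLS: logpgp95 on avexpr (endogenous), instrumented by logextmort4;
# exogenous covariates (here lat_abst) appear on both sides of the bar
iv_fit <- ivreg(logpgp95 ~ avexpr + lat_abst | logextmort4 + lat_abst,
                data = cleaned_dta)
summary(iv_fit, diagnostics = TRUE)
```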

References

Acemoglu, Daron, Simon Johnson, and James A. Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91 (5): 1369–1401. https://doi.org/10.1257/aer.91.5.1369


  1. To get started with reproducible research using R Markdown, see Chapter 26 of R4DS. A more comprehensive guide to reproducible research with R and its programming side is R Markdown: The Definitive Guide. Finally, the definitive treatment is the book Reproducible Research with R and RStudio↩︎

  2. As you may know, computers are rather inaccurate when it comes to floating-point computations (non-integer numbers). Typically, beyond about 16 digits after the decimal point, the computer returns essentially random digits. See, for example, “Circle 1. Falling into the Floating Point Trap” (pp. 9-11) in The R Inferno, or read more in the Floating-point arithmetic article. Or guess what R would return for the comparison .1 == .3/3↩︎

  3. Unfortunately, there are reasonable limits to how far the grade can be improved. The maximum is 100%, which cannot be exceeded, and 80% of the grade is still about reproducing the statistical results from the template solution, which cannot be compensated for by improved themes for plots or tables.↩︎